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Abstract 

The  goals  of  this  research  project  were  the  development  of  control  and  estima¬ 
tion  algorithms  for  nonlinear  systems  which  are  computationally  feasible  with  robust 
performance  despite  numerical  and  modeling  errors.  The  approach  was  based  on  the 
recent  generalization  of  linear  worst  case  (#<»)  controllers  to  nonlinear  systems.  The 
construction  of  nonlinear  Hoo  controllers  depends  on  the  solution  of  two  PDE’s  of 
Hamilton- Jacobi  type.  The  first  is  the  one  associated  with  the  problem  of  Hoo  sub- 
optimal  control  by  state  feedback  that  has  appeared  previously  in  the  work  of  several 
authors.  Numerical  methods  to  compute  a  Taylor  series  solution  term  by  term  have 
been  developed.  The  second  PDE  is  a  new  Hamilton-Jacobi  equation  associated  with 
Hoc  suboptimal  estimation.  A  hybrid  computational  method  to  solve  such  problems 
has  been  developed  . 


1  Nonlinear  Control  by  State  Feedback 

In  this  section  an  algorithm  for  computing  term  by  term  a  state  feedback  Hoc  control  law  for 
a  nonlinear  system  is  described.  The  MATLAB  code  to  impliment  this  algorithm  is  avail¬ 
able  from  http://scad.utdallas.edu/scad/  in  the  Nonlinear  Systems  Toolbox.  The  function 
’’hji.m”  impliments  the  following  algorithm. 

Consider  a  nonlinear  system  of  the  form 

x  =  f(x,u).  (1) 

where  the  input  u  consists  of  two  subvectors  u  =  [ttc;  Ud],  a  control  uc  and  a  disturbance  Ud- 
The  goal  is  to  compute  the  feedback  law  for  uc  that  minimizes  the  maximum  over  all  Ud  the 

COSt  ,oo 

j  l(x,u)  dt.  (2) 
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See  [Kl]  for  further  details.  It  is  assumed  that  for  each  x,  l  is  strictly  convex  in  uc,  strictly 
concave  in  u<i- 

When  it  exists  and  is  smooth  the  optimal  cost  7f(x)  starting  at  x  will  satisfy  the  Hamilton- 


Jacobi-Isaacs  PDE 

Uxf  +  l  =  0  (3) 

n  xfu  +  iu  =  o  (4) 

where  for  each  x  the  evaluation  is  at  the  minmax  u  satisfying  (4). 

It  is  further  assumed  that  /,  l  have  series  expansions 

f(x,u)  =  fw(x,u)  + f[2]{x,u)  + f[3]{x,u)  + ...  + f[di(x,u)  (5) 

l(x,u)  =  Zt2](x,u)  +  Zf3Hx,ix)  +  Zf4l(x,W)  +  ...  +  Z[d+1k^«)  (6) 

where  /^(x,  u),  l^(x,u)  denote  homogeneous  polynomials  of  degree  j.  The  leading  terms 
are  of  the  familiar  linear-quadratic  form 

f^(x,u)  =  Ax  +  Bu  (V 

l®(x,u)  =  \(x'Qx  +  2x'Su  +  u'Ru).  (8) 

A 


The  matrix  R  is  symmetric  and  invertible  but  not  positive  definite  because  of  the  assumption 

that  l  is  strictly  convex  in  uc,  strictly  concave  in  ud. 

Following  Al’brekht  [A],  it  is  assumed  that  the  optimal  cost  tt(x)  and  feedback  k(x)  have 


similar  power  series  expansions 

7f(x)  =  7T^  (x)  +  7T^  (x)  +  7T^  (x)  +  .  .  .  +  7T^+1'  (x)  (9) 

u  =  Uc  =  k(x)  =  (x)  +  (x)  +  K®  (x)  +  .  .  .  +  (x)  (10) 

.Ud 

where  the  leading  terms  are 

tt®(x)  =  ^x'Px  (11) 

k^(x)  =  Kx  (12) 


Note  that  the  feedback  is  the  optimal  one  for  both  the  control  uc  and  the  disturbance  ud. 

These  expansions  are  plugged  into  the  Hamilton- Jacobi- Isaacs  equations  and  terms  of 
the  same  degree  are  collected.  The  leading  terms  in  (3)  are  degree  2  and  those  in  (4)  are  of 
degree  1,  and  yield  the  familiar  Riccati  equation  and  state  feedback, 

0  =  PA  +  A'P  +  Q-(PB  +  S)R-\B'P  +  S')  (13) 

K  =  —R~l(B'P  +  S')  (14) 

The  next  step  is  to  collect  terms  in  (3)  of  degree  3  and  those  in  (4)  of  degree  2,  this  yields 
a  system  of  linear  equations  for  tt^,  k[2]  involving  P,  K  which  are  then  solved.  The  process 
is  repeated  through  terms  in  (3)  of  degree  d  -t- 1  and  those  in  (4)  of  degree  d  yielding  linear 
equations  for  7r(d+1l,  that  depend  on  the  lower  degree  solutions. 
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The  call  of  ’’hji.m”  is  of  the  form 

[ka,  fk ,  py,  Ik]  =  hji(f,  l,n,m,d); 

The  input  parameters  /,  l  are  the  dynamics  and  lagrangian  as  above.  More  precisely,  they 
are  matrices  of  coefficients  of  the  various  monomials  in  /,  l  in  block  lexographic  order  as 
described  in  the  Read  Me  First  file  in  the  Nonlinear  Systems  Toolbox.  The  parameters  n,  m 
are  the  dimensions  of  x,  u  respectively  and  d  is  the  degree  of  the  series  expansion  of  /. 

The  first  output  parameter  is  ka  =  k  and  the  third  output  parameter  is  py  =  n  as 
described  above.  More  precisely  they  are  matrices  of  their  coefficients  in  block  lexographic 
order.  The  other  output  parameters  are  the  matrices  of  the  coefficients  of  the  closed  loop 
dynamics  fk(x )  =  f{x,  k(x))  and  the  closed  loop  lagrangian  lk(x)  =  l(x,  k(x)) 

The  routine  ’’hji.m”  differs  from  ’’hjb.m”  by  calling  ’’care.rn”  instead  of  ”lqr2.m”  to  solve 
the  Riccati  equation.  The  routine  ’’care.rn”  can  handle  indefinte  Q  and  R  while  ”lqr2.m” 
requires  that  Q  be  nonnegative  definite  and  R  be  positive  definite.'  The  current  release  of  the 
MATLAB  Control  Toolbox  contains  ”lqr2.m”  and  the  next  release  will  contain  ’’care.rn”. 

Both  ’’hji.m”  and  ’’hjb.m”  can  solve  the  Hamilton- Jacobi  PDE  for  dynamics  and  la- 
grangians  that  depend  on  static  or  dynamic  parameters.  Such  problems  arise  in  designing 
servos  [K2]  and  model  matching  controllers  [K3].  Certain  restrictions  apply.  See  the  com¬ 
ments  in  the  m-files. 


2  Hybrid  Nonlinear  Estimation 

A  hybrid  algorithm  for  a  nonlinear  state  observer  that  utilizes  two  levels  of  computation 
has  been  developed  [KrD].  On  the  higher  level  one  approximately  computes  a  negative  log 
liklihood  function  Q(x,t )  for  the  currnt  state  given  the  past  obseravtions  and  initial  state 
liklihood.  The  most  likely  estimate  of  x(t)  is  x(t)  =  argminQ(x,  t)  At  the  lower  level,  we 
initiate  local  observers  that  resemble  extended  Kalman  filters  at  the  local  minima  of  Q(x,  t ). 
These  are  computed  on  a  much  faster  time  scale.  One  also  computes  how  well  they  explain 
the  observations,  and  takes  as  the  estimate,  the  one  that  best  explains  the  observations  to 
date.  This  algorithm  can  be  suitably  modified  to  calculate  nonlinear  estimators  [Kl]. 

Since  the  computation  of  the  Q  function  is  expensive,  it  is  done  on  a  relatively  coarse 
spatial  and  temporal  grid.  Hence  the  minimum  of  Q  converges  slowly  to  the  true  state  and  is 
never  very  accurate  due  to  the  coarseness  of  the  grid.  The  local  observers  are  computationally 
inexpensive  especially  since  the  filter  gains  are  derived  from  Q  rather  than  solutions  of 
Riccati  equations.  Moreover  when  initialized  close  to  the  true  value,  they  converge  quickly 
and  accurately.  However  if  they  are  initialized  far  from  the  true  value,  they  don’t  always 
converge  to  it.  The  coarse  information  in  Q  allows  one  to  initialize  the  local  observers 
properly. 

Mortensen  [M]  and  Hijab  [H]  introduced  the  concept  of  minimum  energy  estimation. 
Given  an  initial  state  estimate  x°,  an  observation  history  (y(s) :  0  <  s  <  t}  and  an  endpoint 
x  one  seeks  the  minimum  “energy”  triple  x°,  w(s),  v(s)  satisfying 

x(s)  =  f(x(s))  +  w{s) 
y(s)  =  h(x(s))  +  v{s) 
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(15) 

(16) 


(17) 

(18) 


x(0)  =  a:0 
x(t )  =  x. 


The  “energy”  of  the  triple  x°,  w(s),  v(s)  is  defined  as 


1 

2 


e 


a(t~s ) 

w(s) 

2  -at 

ds  +  0 

O 

1 

o 

v(s) 

2 

1  1 

(19) 


Increasing  the  forgetting  factor  a  decreases  the  importance  of  the  initial  state  estimate 
and  earlier  observations  and  increases  the  importance  of  the  later  observations.  The  value 
Q(x,  t )  is  a  measure  of  the  likelihood  that  x(t)  =  x  given  the  initial  state  estimate  and  the 
observations  to  date.  The  smaller  Q(x,t )  is,  the  more  likely  x(t)  =  x.  Let  Q(x,t)  denote 
the  infimum  of  (19)  over  all  triples  satisfying  (15-18),  then  the  minimum  energy  estimate  is 


x(t)  =  argminQ(x,  t). 


(20) 


It  is  not  hard  to  see  that  Q  is  a  solution  in  the  viscosity  sense  of  the  Hamilton  Jacobi  Bellman 
(HJB)  PDE 

oiQ  +  Qt  +  Qxf  +  -^xQ!x  ~  2^  —  ^l2  =  9-  (21) 

For  Hoc  estimation  the  PDE  is  slightly  different,  [Kl], 

Qt  +  Qx  f  +  i^QxQ'x  ~  ^  \y  ~  c\2  +  2  —  =  9  (22) 

Following  Kushner  and  Dupuis  [KD],  we  compute  Q  not  by  approximately  solving  the  HJB 
PDE  but  rather  by  solving  an  approximating  nonlinear  program.  Let  r,  k  be  relatively 
coarse  spatial  and  temporal  steps.  Choose  a  subdomain  of  Rn  where  the  state  is  known  to 
be  and  consider  the  rectangular  lattice  of  points  in  the  subdomain  with  spacing  r.  Following 
the  dynamic  program  approach  (in  forward  time),  we  define  the  approximate  solution  Q(x,  t) 
of  (21)  at  lattice  points  x  and  time  steps  t  by 


Q(x,  t  +  k)  =  inf  {(1  —  ak)Q(z,t) 


(23) 


+ 

+ 


x-z  f(x,t  +  k)  +  f(z,t) 


k 


y{t) 


k 

2 


h(x,  t)  +  h(z,t ) 


k 

2 


Q(x,  0)  =  - 


x  —  X 


(24) 


where  the  infimum  is  over  z  in  the  whole  lattice  in  the  subdomain  or  some  subset  such  as 
the  2 n  nearest  neighbors  of  x. 

Notice  that  the  computation  of  Q  must  be  done  in  real  time  because  of  the  presence  of 
y(t).  The  complexity  of  the  computation  is  inversly  proportional  to  the  spatial  step  r  to  the 
power  of  the  state  dimension  n.  Hence  there  is  a  tradeoff  between  accuracy  (small  r)  and 
computational  ease  (large  r).  Of  course  similar  difficulties  arise  in  all  nonlinear  estimation 
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algorithms,  for  example,  nonlinear  filtering  requires  solving  the  Zakai  stochastic  PDE  in  real 
time. 

The  extended  Kalman  Filtering  is  an  alternative  approach  which  can  be  very  accurate 
when  it  converges.  However  it  may  fail  to  converge  if  the  problem  is  highly  nonlinear.  If  we 
assume  that  the  w,  v  in  (15,16)  are  independent  standard  white  Gaussian  noises  and  the 
initial  state  estimate  is  an  independent  Gaussian  random  vector  with  mean  x°  and  covariance 
P°  then  the  extended  Kalman  Filter  (EKF)  takes  the  form 


X  = 

f(x,t)  +  Ph!x{x,t){y-h{x,t)) 

(25) 

p  = 

fx{x,t)P  +  Pfx{x,t)'  +  I 

—Phx(x,  t)hx(x,  t)P 

(26) 

i(0)  = 

x° 

(27) 

P(  0)  = 

p0 

(28) 

An  example  of  a  highly  nonlinear  problem  where  an  EKF  may  fail  to  converge  is 

X  =  x{\  —  x2) 

(29) 

y  =  x2  +  ex. 

(30) 

If  e  =  0  the  states  rr,  -x  are  indistinguishable  but  for  nonzero  e  the  system  is  observable. 
The  dynamics  has  stable  equilibria  at  x  =  ±1  and  an  unstable  equilibrium  at  x  =  0.  If 
e  >  0,  the  system  is  initialized  near  —1  and  the  EKF  is  initialized  near  1,  the  EKF  will  fail 
to  converge  to  the  true  value  [KrDj. 

Suppose  Q(x,t)  is  a  smooth  solution  to  HJB  PDE  (21),  x(t)  is  a  relative  minimum  of 
Q(x,t )  and  q(t )  =  Q(x{t),t)  then 

o  =  (3D 

0  =  QxxiHtfitfHt)  +  Qxt(x{t),t).  (32) 

If  one  partially  differentiates  (21)  with  respect  to  x  and  evaluates  at  x(t),  t  one  obtains 

0  =  Qtx(x(t),t)  +  Qxx(x(t),t)f 

+hx(x(t),  t)(y  -  h(x(t),  t).  (33) 

From  the  last  two  equations  one  obtains 

x  =  f(x,t)  +Q~£(x,t)tix(x,t)(y  -  h(x,t))  (34) 

and  evaluating  (21)  at  x(t),  t  yields 

q  =  -aq  +  ^\y-h(x,t)\.  (35) 

These  are  the  equations  of  a  local  observer  based  on  Q.  Notice  the  similarity  of  (34)  to  (25) 
of  an  EKF. 
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The  hybid  approach  is  as  follows. 

1)  Compute  Q(x,  t )  by  a  nonlinear  programming  approximation  (23)  on  a  coarse  spatial  and 
temporal  grid, 

2)  At  each  relative  minimum  of  Q(x,t),  initialize  a  local  observer  x(t),  q(t ) 

3)  Let  the  various  local  observers  x(t),q(t)  evolve  according  to  (34,  35)  on  a  fast  time  scale, 

4)  Eliminate  redundant  local  observers  when  they  come  close  together, 

5)  Choose  as  the  current  estimate,  the  x(t)  of  the  local  observer  with  smallest  q(t). 

While  this  algorithm  can  result  in  a  large  number  of  local  observers,  the  computational 

burden  associated  with  each  one  is  quite  small,  less  than  an  EKF .  Each  local  observer 
invloves  integrating  n  +  1  deferential  equations  instead  of  (n2  +  3n)/2  for  an  EKF.  The  total 
computational  burden  associated  with  computing  Q(x ,  t)  on  a  coarse  spatial  and  temporal 
grid  and  computing  many  local  observers  on  a  fine  temporal  scale  is  considerably  lighter 
than  computing  Q{x ,  t)  on  a  fine  spatial  and  temporal  grid.  Moreover  the  accuracy  of  the 
solution  of  the  HJB  PDE  (21)  is  limited  by  the  fineness  of  the  spatial  grid  while  machine 
precision  is  the  limit  on  the  spatial  accuracy  of  a  local  observer.  In  [KrD]  this  observer  is 
applied  to  a  pair  of  examples. 
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